Zero Downtime Deployment with Docker Swarm
If you are a software developer who has already worked with production software, you surely know the struggle:
deployment time
Hopefully you have already solved the “can I trust my deployment?” issue with good practices such as Continuous Integration, Automated Testing, Continuous Delivery and so on. Today we’ll talk about zero downtime.
the problem
You have already packaged your application into a Docker image, published it to a Docker registry, and started it on your production server with:
docker run ...
Nothing wrong, your application is up and running. But… how can we update it?
The easiest solution is to stop the old one and start the new version:
docker stop <running container id>
docker run <new version>
There’s a problem with this approach: between the moment the current instance stops and the moment the new version finishes its bootstrap, your application will not respond.
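To see that gap for yourself, you can poll the application while you run the stop/run sequence. The following is only a minimal sketch: `probe` is a hypothetical helper name, and the URL you pass to it (for example the `http://localhost:8080/health` endpoint used later in this post) is an assumption about your application.

```shell
#!/bin/sh
# probe URL COUNT
# Requests URL COUNT times, about one second apart, and prints how many
# of those requests failed (connection refused, timeout, or HTTP error).
# "probe" is a hypothetical helper name, not a Docker or curl command.
probe() {
    url="$1"
    count="$2"
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # --fail makes curl exit non-zero on HTTP error responses
        if ! curl --silent --fail --max-time 1 --output /dev/null "$url"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "$failures"
}
```

Running `probe http://localhost:8080/health 60` in a second terminal while you update should print a number greater than zero with the stop/run approach.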
the solution
This problem is pretty common and can be resolved in many ways. We discovered an almost “effort free” way using Docker Swarm.
what is Docker Swarm?
Swarm is a Docker “mode” already included in your Docker installation. It’s a powerful cluster engine that will help you scale your application.
And, it will resolve your downtime problems.
how to?
The idea is to turn your Docker instance into a single-node swarm cluster:
docker swarm init
This command should return something like
Swarm initialized: current node (<node_id>) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token <swarm_token> <node_address:port>
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
These are the instructions for adding nodes to the cluster, but you don’t need them right now.
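One detail worth knowing: running `docker swarm init` on a node that is already part of a swarm fails with an error. If the command lives in a provisioning script that may run more than once, a guard like the one below can help. This is a sketch, and `swarm_state` and `ensure_swarm` are hypothetical helper names; only the `docker` commands inside them come from Docker itself.

```shell
#!/bin/sh
# Prints the local swarm state reported by the Docker daemon,
# for example "active" or "inactive".
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# Initializes a single-node swarm only when this node is not already
# part of one. "ensure_swarm" is a hypothetical helper name.
ensure_swarm() {
    if [ "$(swarm_state)" = "active" ]; then
        echo "swarm already active"
    else
        docker swarm init
    fi
}
```

Calling `ensure_swarm` then replaces the bare `docker swarm init` in your script.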
Now you need to deploy your stack.
To do that, define a docker-compose file like this:
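version: '3.7'
networks:
  my-network:
    external: false
services:
  my-server:
    image: ${IMAGE}
    hostname: my-server
    # container_name is not supported in swarm mode; docker stack deploy ignores it
    container_name: my-server
    ports:
      - '8080:8080'
    networks:
      # must match a network declared in the top-level "networks" section
      - my-network
    healthcheck:
      # curl must be available inside the image for this check to work
      test: ["CMD", "curl", "-i", "--fail", "http://localhost:8080/health"]
    deploy:
      mode: replicated
      replicas: 2
      update_config:
        # start the new task before stopping the old one
        order: start-first
        failure_action: rollback
        delay: 5s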
In this example the service runs with two replicas, but you can deploy as many as you need.
Some concepts:
- image: a variable that holds the image name and version
- networks: the stack network, through which the various services can communicate with each other
- healthcheck: it is very important to define a command that verifies the service status. It should return exit code 0 when the service is working and a non-zero exit code when it is not started yet, stopped, paused, etc.
- replicas: how many parallel instances of the service will be deployed
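The healthcheck contract is easy to try by hand. The sketch below wraps the same `curl --fail` idea in two hypothetical helpers: `check_health` returns curl's exit status, and `wait_until_healthy` retries it a limited number of times. The default URL is taken from the compose file above; the helper names and the one-second pause are assumptions made for illustration, and this is not how Swarm implements its own checks.

```shell
#!/bin/sh
# Exit status 0 when the endpoint answers successfully, non-zero when the
# request fails or the server returns an HTTP error.
# "check_health" is a hypothetical helper name.
check_health() {
    curl --silent --fail --output /dev/null "${1:-http://localhost:8080/health}"
}

# wait_until_healthy URL ATTEMPTS
# Retries check_health up to ATTEMPTS times, one second apart.
# Returns 0 as soon as a check succeeds, 1 if none does.
wait_until_healthy() {
    url="$1"
    attempts="$2"
    while [ "$attempts" -gt 0 ]; do
        if check_health "$url"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

For example, `wait_until_healthy http://localhost:8080/health 30 && echo ready` prints `ready` once the service answers.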
To start your stack run the command:
export IMAGE=${IMAGE_NAME}:${IMAGE_VERSION}
docker stack deploy -c docker-compose.yml <stack_name> --with-registry-auth
Now the service will start. From this point on, running the same command with a different value of IMAGE_VERSION updates your app without downtime: swarm takes care of starting the new instances and, once they have started correctly, of stopping the old ones.
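Since every update is the same two commands with a different version, it can be handy to wrap them in a small script. Here is one possible sketch. `deploy_version` and `rollout_tasks` are hypothetical helper names, and the arguments in the usage line are placeholders. The second helper relies on the fact that a service deployed through a stack is named `<stack_name>_<service>`, so the `my-server` service from the compose file becomes `<stack_name>_my-server`.

```shell
#!/bin/sh
# deploy_version IMAGE_NAME IMAGE_VERSION STACK_NAME
# Exports IMAGE for the compose file and (re)deploys the stack.
# "deploy_version" is a hypothetical helper name.
deploy_version() {
    image_name="$1"
    image_version="$2"
    stack_name="$3"
    IMAGE="${image_name}:${image_version}"
    export IMAGE
    docker stack deploy -c docker-compose.yml --with-registry-auth "$stack_name"
}

# rollout_tasks STACK_NAME
# Lists the tasks of the my-server service so you can watch the new
# replicas start and the old ones shut down.
rollout_tasks() {
    docker service ps "${1}_my-server"
}
```

A first deployment and a later update then look the same, for example `deploy_version my-app 1.2.0 my-stack` followed by `rollout_tasks my-stack`.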